A Study on Software Testability and the Quality of Testing in Object-Oriented Systems
Software testing is known to be important to the delivery of high-quality systems, but it is also challenging, expensive and time-consuming. This has motivated academic and industrial researchers to seek ways to improve the testability of software. Software testability is the ease with which a software artefact can be effectively tested.
The first step towards building testable software components is to understand the factors – of software processes, products and people – that are related to and can influence software testability. In particular, the goal of this thesis is to provide researchers and practitioners with a comprehensive understanding of the design and source code factors that can affect the testability of a class in object-oriented systems. This thesis considers three different views on software testability that address three related aspects: 1) the distribution of unit tests in relation to the dynamic coupling and centrality of software production classes, 2) the relationship between dynamic (i.e., runtime) software properties and class testability, and 3) the relationship between code smells, test smells and the factors related to smell distribution. The thesis utilises a combination of source code analysis techniques (both static and dynamic), software metrics, software visualisation techniques and graph-based metrics (from complex networks theory) to address these goals.
A systematic mapping study was first conducted to thoroughly investigate the body of research on dynamic software metrics and to identify issues associated with their selection, design and implementation. This mapping study identified, evaluated and classified 62 research works based on a pre-tested protocol and a set of classification criteria. Based on the findings of this study, a number of dynamic metrics were selected and used in the experiments that were then conducted.
The thesis demonstrates that by using a combination of visualisation, dynamic analysis, static analysis and graph-based metrics it is feasible to identify central classes and to diagrammatically depict testing coverage information. Experimental results show that, even in projects with high test coverage, some classes appear to be left without any direct unit testing, even though they play a central role during a typical execution profile. It is contended that the proposed visualisation techniques could be particularly helpful when developers need to maintain and reengineer existing test suites.
Another important finding of this thesis is that a class's execution frequency and dynamic coupling are correlated with its testability – frequently executed and tightly coupled classes require larger unit tests and more test cases. This information could inform estimates of the effort required to test classes when developing new unit tests or when maintaining and refactoring existing tests.
An additional key finding of this thesis is that test and code smells, in general, can have a negative impact on class testability. Increasing levels of size and complexity in code are associated with the increased presence of test smells. In addition, production classes that contain smells generally require larger unit tests, and are also likely to be associated with test smells in their associated unit tests. Some particular smells are more significantly associated with class testability than others. Furthermore, some particular code smells can be seen as a sign of the presence of test smells, as some test and code smells are found to co-occur in the test and production code. These results suggest that code smells, and specifically certain types of smells, as well as measures of size and complexity, can be used to provide a more comprehensive indication of smells likely to emerge in test code produced subsequently (or vice versa in a test-first context). Such findings should contribute positively to the work of testers and maintainers when writing unit tests and when refactoring and maintaining existing tests.
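To make the smell terminology above concrete, the following is a purely illustrative (hypothetical) Java example of co-occurring smells: a production method showing early signs of the Long Method smell (parsing, validation and formatting in one place), paired with a unit test exhibiting the Assertion Roulette smell – several unexplained assertions, so a failure does not say which expectation broke. None of this code is drawn from the systems studied in the thesis.

```java
// Illustrative example only: a production method with several
// responsibilities, and a test with the Assertion Roulette smell.
public class SmellExample {

    // Production code: parsing, validation and formatting in one method.
    public static String normalisePhone(String raw) {
        String digits = raw.replaceAll("[^0-9]", "");
        if (digits.startsWith("00")) digits = digits.substring(2);
        if (digits.length() < 7) throw new IllegalArgumentException(raw);
        return "+" + digits;
    }

    // Test code: Assertion Roulette -- which check failed is not obvious,
    // because none of the checks carries an explanatory message.
    public static void main(String[] args) {
        String out = normalisePhone("00 64 (4) 123-4567");
        check(out.startsWith("+"));
        check(!out.contains(" "));
        check(out.equals("+6441234567"));
        check(out.length() == 11);
        System.out.println("all checks passed");
    }

    static void check(boolean ok) {
        if (!ok) throw new AssertionError("a check failed (but which one?)");
    }
}
```

A smell-aware refactoring would split the production method and give each test assertion a descriptive message, addressing both smells at once.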
Flaky Test Sanitisation via On-the-Fly Assumption Inference for Tests with Network Dependencies
Flaky tests cause significant problems as they can interrupt automated build
processes that rely on all tests succeeding and undermine the trustworthiness
of tests. Numerous causes of test flakiness have been identified, and program
analyses exist to detect such tests. Typically, these methods produce advice to
developers on how to refactor tests in order to make test outcomes
deterministic. We argue that one source of flakiness is the lack of assumptions
that precisely describe under which circumstances a test is meaningful. We
devise a sanitisation technique that can isolate flaky tests quickly by
inferring such assumptions on-the-fly, allowing automated builds to proceed as
flaky tests are ignored. We demonstrate this approach for Java and Groovy
programs by implementing it as extensions for three popular testing frameworks
(JUnit4, JUnit5 and Spock) that can transparently inject the inferred
assumptions. If JUnit5 is used, those extensions can be deployed without
refactoring project source code. We demonstrate and evaluate the utility of our
approach using a set of six popular real-world programs, addressing known test
flakiness issues in these programs caused by dependencies of tests on network
availability. We find that our method effectively sanitises failures induced by
network connectivity problems with high precision and recall.
Comment: to appear at IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM)
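The core idea – treating a network failure as a violated assumption rather than a test failure – can be sketched in plain Java. This is a simplified sketch, not the implementation of the framework extensions described above; the class and method names are hypothetical.

```java
import java.net.ConnectException;
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

// Minimal sketch (not the paper's implementation): classify a test outcome,
// converting network failures into violated assumptions so an automated
// build can proceed with the test skipped instead of failed.
public class FlakySanitiser {

    public enum Outcome { PASS, FAIL, SKIPPED }

    public static Outcome run(Callable<Void> testBody) {
        try {
            testBody.call();
            return Outcome.PASS;
        } catch (UnknownHostException | ConnectException e) {
            // Inferred assumption "the network is reachable" does not hold.
            return Outcome.SKIPPED;
        } catch (Exception e) {
            return Outcome.FAIL;
        }
    }

    public static void main(String[] args) {
        Outcome o = run(() -> { throw new UnknownHostException("api.example.com"); });
        System.out.println(o); // SKIPPED, not FAIL
    }
}
```

In the actual approach, the equivalent effect is achieved transparently by framework extensions that inject the inferred assumptions (e.g. via JUnit's assumption mechanism) rather than by wrapping test bodies by hand.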
Evil Pickles: DoS Attacks Based on Object-Graph Engineering (Artifact)
This artefact demonstrates the effects of the serialisation vulnerabilities described in the companion paper. It is composed of three components: scripts, including source code, for the Java, Ruby and C# serialisation vulnerabilities, two case studies that demonstrate attacks based on the vulnerabilities, and a contracts-based mitigation strategy for serialisation-based attacks on Java applications. The artefact allows users to witness how the serialisation-based vulnerabilities result in behaviour that can be used in security attacks. It also supports the repeatability of the case study experiments and the benchmark for the mitigation measures proposed in the paper. Instructions for running the tasks are provided along with a description of the artefact setup.
Evil Pickles: DoS Attacks Based on Object-Graph Engineering
In recent years, multiple vulnerabilities exploiting the serialisation APIs of various programming languages, including Java, have been discovered. These vulnerabilities can be used to devise injection attacks, exploiting the presence of dynamic programming language features like reflection or dynamic proxies. In this paper, we investigate a new type of serialisation-related vulnerability for Java that exploits the topology of object graphs constructed from classes of the standard library, such that deserialisation leads to resource exhaustion, facilitating denial-of-service attacks. We analyse three such vulnerabilities that can be exploited to exhaust stack memory, heap memory and CPU time. We discuss the language and library design features that enable these vulnerabilities, and investigate whether these vulnerabilities can be ported to C#, JavaScript and Ruby. We present two case studies that demonstrate how the vulnerabilities can be used in attacks on two widely used servers, Jenkins deployed on Tomcat and JBoss. Finally, we propose a mitigation strategy based on contract injection.
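The CPU-exhaustion flavour of such an object graph can be illustrated at a deliberately harmless scale. The sketch below follows the well-known nested-HashSet pattern (a layered DAG in which computing hashCode() on the root visits an exponential number of paths); names and depths are illustrative, and a real payload would be serialised with a far greater depth.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of the object-graph shape behind CPU-exhaustion
// attacks: each layer holds two sets that both reference the two sets of
// the layer below, so hashCode() recursion doubles per layer. Depth is
// kept tiny here so the example stays cheap to run.
public class NestedSets {

    public static Set<Object> build(int depth) {
        Set<Object> root = new HashSet<>();
        Set<Object> s1 = root;
        Set<Object> s2 = new HashSet<>();
        for (int i = 0; i < depth; i++) {
            Set<Object> t1 = new HashSet<>();
            Set<Object> t2 = new HashSet<>();
            t1.add("foo");          // make t1 unequal to t2
            s1.add(t1); s1.add(t2); // both parents reference both children
            s2.add(t1); s2.add(t2);
            s1 = t1;
            s2 = t2;
        }
        return root;
    }

    public static void main(String[] args) {
        Set<Object> root = build(10);
        // ~2^10 leaf visits here; at depth 100 the same call would take
        // effectively forever -- a denial of service on deserialisation,
        // since HashSet recomputes hash codes when it is deserialised.
        System.out.println(root.hashCode() == root.hashCode());
    }
}
```

The danger comes from the fact that the graph is tiny to serialise (its size is linear in the depth) but exponential to traverse, and standard-library collections trigger that traversal automatically during deserialisation.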
Does class size matter? An in-depth assessment of the effect of class size in software defect prediction
In the past 20 years, defect prediction studies have generally acknowledged
the effect of class size on software prediction performance. To quantify the
relationship between object-oriented (OO) metrics and defects, modelling has to
take into account the direct, and potentially indirect, effects of class size
on defects. However, some studies have shown that size cannot simply be
controlled or ignored when building prediction models. As such, the question
remains of whether, and when, to control for class size. This study provides a
new in-depth examination of the impact of class size on the relationship
between OO metrics and software defects or defect-proneness. We assess the
impact of class size on the number of defects and defect-proneness in software
systems by employing a regression-based mediation (with bootstrapping) and
moderation analysis to investigate the direct and indirect effect of class size
in count and binary defect prediction. Our results show that the size effect is
not always significant for all metrics. Of the seven OO metrics we
investigated, size consistently has a significant mediation impact only on the
relationship between Coupling Between Objects (CBO) and
defects/defect-proneness, and a potential moderation impact on the relationship
between Fan-out and defects/defect-proneness. Based on our results we make
three recommendations. One, we encourage researchers and practitioners to
examine the impact of class size for the specific data they have in hand and
through the use of the proposed statistical mediation/moderation procedures.
Two, we encourage empirical studies to investigate the indirect effect of
possible additional variables in their models when relevant. Three, the
statistical procedures adopted in this study could be used in other empirical
software engineering research to investigate the influence of potential
mediators/moderators.
Comment: Accepted to Empirical Software Engineering (to appear). arXiv admin note: text overlap with arXiv:2104.1234
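The product-of-coefficients mediation decomposition used in such analyses can be sketched with synthetic data. This is illustrative only: the variable names, the generated data and the absence of bootstrapping are all simplifications, not a reproduction of the study's procedure.

```java
// Hypothetical sketch of simple mediation: X = class size, M = an OO
// metric acting as mediator, Y = defect count. The indirect effect of X
// on Y through M is a*b, the direct effect is c', and for OLS the two
// sum exactly to the total effect of X on Y.
public class MediationSketch {

    // Returns {a, b, cPrime}: a is the slope of M ~ X; b and cPrime come
    // from the two-predictor regression Y ~ M + X (centred variables,
    // solved via the 2x2 normal equations with Cramer's rule).
    public static double[] mediate(double[] x, double[] m, double[] y) {
        double sxx = 0, sxm = 0, smm = 0, sxy = 0, smy = 0;
        double mx = mean(x), mm = mean(m), my = mean(y);
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - mx, dm = m[i] - mm, dy = y[i] - my;
            sxx += dx * dx; sxm += dx * dm; smm += dm * dm;
            sxy += dx * dy; smy += dm * dy;
        }
        double a = sxm / sxx;                   // M ~ X
        double det = smm * sxx - sxm * sxm;     // normal equations, Y ~ M + X
        double b = (smy * sxx - sxm * sxy) / det;
        double cPrime = (smm * sxy - sxm * smy) / det;
        return new double[] { a, b, cPrime };
    }

    static double mean(double[] v) {
        double s = 0;
        for (double d : v) s += d;
        return s / v.length;
    }

    public static void main(String[] args) {
        int n = 15;
        double[] x = new double[n], m = new double[n], y = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = i;                                // size
            m[i] = 2 * i + (i % 3 - 1);              // mediator driven by size
            y[i] = 3 * m[i] + 0.5 * i + (i % 5 - 2); // defects driven by both
        }
        double[] r = mediate(x, m, y);
        // prints indirect=5.881 direct=0.833 for this synthetic data
        System.out.printf("indirect=%.3f direct=%.3f%n", r[0] * r[1], r[2]);
    }
}
```

In the study itself, the significance of the indirect effect a*b is assessed with bootstrapped confidence intervals rather than read off a single point estimate as above.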
Assert use and defectiveness in industrial code
The use of asserts in code has received increasing attention in the software engineering community in the past few years, even though it has been a recognised programming construct for many decades. A previous empirical study by Casalnuovo showed that methods containing asserts had fewer defects than those that did not. In this paper, we analyse the test classes of two industrial telecom Java systems to lend support to, or refute, that finding. We also analyse the physical position of asserts in methods to determine if there is a relationship between assert placement and method defect-proneness. Finally, we explore the role of test method size and the relationship it has with asserts. In terms of the previous study by Casalnuovo, we found only limited evidence to support the earlier results. We did, however, find that defective methods with one assert tended to be positioned significantly lower in the class than non-defective methods. Finally, method size seemed to correlate strongly with asserts, but surprisingly less so when we excluded methods with just one assert. The work described highlights the need for more studies into this aspect of code, one which has strong links with code comprehension.
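One way the physical position of asserts could be quantified is sketched below. This is a hypothetical, purely textual heuristic (relative line offset within a method body), not the measurement procedure used in the paper, and the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical heuristic: the relative position (0.0 = first line,
// 1.0 = last line) of each assert statement within a method body.
// A crude prefix match is used; a real study would parse the code.
public class AssertPosition {

    public static List<Double> relativePositions(String[] bodyLines) {
        List<Double> positions = new ArrayList<>();
        for (int i = 0; i < bodyLines.length; i++) {
            if (bodyLines[i].trim().startsWith("assert")) {
                positions.add(i / (double) (bodyLines.length - 1));
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String[] body = {
            "int result = parse(input);",
            "assert result >= 0;",
            "store(result);",
            "assert stored.contains(result);",
            "notifyListeners();"
        };
        System.out.println(relativePositions(body)); // [0.25, 0.75]
    }
}
```

Aggregating such positions per method would allow the "asserts located lower in the class" observation above to be tested statistically against defect data.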